Run, skeleton, run: skeletal model in a physics-based simulation

Authors

  • Mikhail Pavlov
  • Sergey Kolesnikov
  • Sergey M. Plis
Abstract

In this paper, we present our approach to solving the physics-based reinforcement learning challenge “Learning to Run”, whose objective is to train a physiologically-based human model to navigate a complex obstacle course as quickly as possible. The environment is computationally expensive, stochastic, and has a high-dimensional continuous action space. We benchmark state-of-the-art policy-gradient methods and test several improvements, such as layer normalization, parameter noise, and action and state reflecting, to stabilize training and improve its sample efficiency. We found that the Deep Deterministic Policy Gradient method is the most efficient method for this environment and that the improvements we introduced help to stabilize training. The learned models are able to generalize to new physical scenarios, e.g. different obstacle courses.
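
The abstract names "action and state reflecting" without describing it. One plausible reading, offered here only as a sketch, is left-right mirror augmentation: each stored transition is duplicated with the left-leg and right-leg entries of the state and action swapped. The index layout below (`STATE_LEFT`, `ACTION_LEFT`, and so on) is hypothetical and does not reflect the real environment's vectors, and the sketch assumes the reward is unchanged under reflection.

```python
# Hypothetical toy layout: 4-value state and 4-value action, where the
# first two entries belong to the left leg and the last two to the right.
STATE_LEFT, STATE_RIGHT = [0, 1], [2, 3]
ACTION_LEFT, ACTION_RIGHT = [0, 1], [2, 3]


def reflect(vec, left_idx, right_idx):
    """Return a copy of vec with paired left/right entries swapped."""
    out = list(vec)
    for l, r in zip(left_idx, right_idx):
        out[l], out[r] = vec[r], vec[l]
    return out


def augment(transition):
    """Return the original transition plus its mirrored counterpart.

    Assumes the reward and done flag are invariant under reflection.
    """
    state, action, reward, next_state, done = transition
    mirrored = (
        reflect(state, STATE_LEFT, STATE_RIGHT),
        reflect(action, ACTION_LEFT, ACTION_RIGHT),
        reward,
        reflect(next_state, STATE_LEFT, STATE_RIGHT),
        done,
    )
    return [transition, mirrored]


if __name__ == "__main__":
    t = ([1.0, 2.0, 3.0, 4.0], [0.1, 0.2, 0.3, 0.4], 1.0,
         [5.0, 6.0, 7.0, 8.0], False)
    for item in augment(t):
        print(item)
```

Under this reading, both transitions would go into the replay buffer, so each simulator step yields two training samples. A real skeletal model may also need sign flips on lateral coordinates, which this sketch omits.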

Similar resources

Numerical modeling of wave run-up along columns of semi-submersible platforms

Wave run-up is one of the most important and effective parameters in designing semi-submersible platforms. Besides unforeseen effects on the movements and response of the platform, wave run-up can also cause slamming forces to be exerted on the lower deck of the platform. Therefore, at the first stages of this plan, before running tests on the model of the platform, numerical methods are usuall...

A two-sided Bernoulli-based CUSUM control chart with autocorrelated observations

Usually, in monitoring a proportion p, the binary observations are considered independent; however, in many real cases, there is a continuous stream of autocorrelated binary observations in which a two-state Markov chain model is applied with first-order dependence. On the other hand, the Bernoulli CUSUM control chart which is not robust to autocorrelation can be applied two-sided co...

Dynamic Simulation of CNTFET-Based Digital Circuits

In this paper we propose a simulation study to carry out dynamic analysis of CNTFET-based digital circuits, introducing into the semi-empirical compact model for CNTFETs, already proposed by us, both the quantum capacitance effects and the sub-threshold currents. To verify the validity of the obtained results, a comparison with the Wong model was carried out. Our mode...

A New Optimization Model for Designing Acceptance Sampling Plan Based on Run Length of Conforming Items

The purpose of this article is to present an optimization model for designing an acceptance sampling plan based on cumulative sum of run length of conforming items. The objective is to minimize the total loss including both the producer and consumer losses. The concept of minimum angle method is applied to consider producer and consumer risks in the optimization model. Also the average number o...

The Role of Institutions in the Dynamic Effects of Oil Revenues in Oil Economies

The purpose of this paper is to investigate the system of oil revenues effects on the production performance of oil-rich countries in both the short and long run. To reveal new insight, a macroeconomic model is designed to hypothesize long-run structural relations in the economies of the oil-rich countries including three long-run relationships of real output, real money balance, and the adjusted p...

Journal:
  • CoRR

Volume: abs/1711.06922

Publication year: 2017